REMARKS 



Claims 1-13 were pending in the application. By this 
amendment, Claim 13 is canceled and amendments are 
presented for Claims 1, 7, 8 and 12. Claims 1-12 are now
pending.

The Examiner has rejected Claims 12-13 under 35 USC 
101 as directed to non-statutory subject matter. 
Applicants have amended Claim 12 and have canceled Claim 
13. The Examiner has rejected Claims 1-13 (now 1-12) under
35 USC 102(b) as anticipated by Yamaguchi. For the reasons
set forth below, Applicants believe that the claims, as 
amended, are patentable over the cited art. 

The present application teaches and claims a system 
and method for speech recognition for recognizing original 
speech even when the original speech is superimposed with 
an echo generated by the environment. The present 
application expressly teaches that "in consideration of the 
long impulse response, an echo can be sufficiently 
simulated even if the echo is assumed to be superimposed 
onto a speech signal O(ω, t) to be determined at the
current time point while being dependent on a speech signal
O(ω, tp) in the immediately previous frame. That is, by
using the formula (2) above to determine acoustic model 



JP920030128US1 






data with the highest likelihood for a speech signal from a 
predetermined acoustic model data and the value of α (i.e.,
the echo prediction coefficient), it is possible to use a
corresponding language model data to perform speech
recognition using only a speech signal from one channel"
(page 25, lines 6-15). The present application further
teaches that "speech input as a reference signal is not
required" (page 11, lines 18-19). Accordingly, in contrast
to prior art speech recognition systems which require 
multiple model training iterations and/or multiple input 
channels, the present invention can dynamically calculate 
an echo prediction coefficient to generate echo speech 
model data for generating adapted acoustic model data. 

The Yamaguchi patent is directed to a scheme for model 
adaptation in pattern recognition based on Taylor 
expansion. Specifically, the Yamaguchi patent detects a 
change in a parameter representing a condition of pattern 
recognition (e.g., noise during speech recognition) based 
on a change in that condition between when the model was 
created and when the pattern recognition using the model is 
being done. For example, when speech recording conditions
differ at the time of actual recognition (Col. 1, lines 29-33),
the Yamaguchi system calculates the difference
between the original (i.e., at time of creation of the
"initial noisy speech") and current (i.e., at time of
recognition) conditions (Col. 11, lines 41-45) and
generates an adapted noisy speech model to be used for
recognizing speech under the current conditions. Yamaguchi
teaches "[i]n the model adaptation apparatus of Fig. 3, 
first at a time of model training, the initial noise HMM is 
obtained from the background noise that is entered at a 
speech input unit 1 and extracted at a noise extraction 
unit 2" (see: Col. 11, lines 23-26). Accordingly, 
Yamaguchi requires both an earlier acquired control noise 
signal and an earlier acquired speech input signal to 
create the models. Thereafter, Yamaguchi adjusts the 
models based on new noise input. 

Applicants respectfully assert that the Yamaguchi patent 
does not anticipate the invention as claimed. With 
specific reference to the claim language, Yamaguchi does
not teach speech recognition steps and means for storing
a feature quantity acquired from a current speech signal 
for each frame; storing acoustic model data and language 
model data, respectively; an echo adaptation model 
generating portion for generating echo speech model data 
from a speech signal acquired immediately prior to a 
current speech signal to be processed at the current time 
point and using the echo speech model data to generate 
adapted acoustic model data; and recognition processing
means for utilizing said feature quantity, said adapted 
acoustic model data and said language model data to provide 
a speech recognition result of the speech signal. 

With respect to the claim feature of steps and means for 
storing a feature quantity acquired from a current speech 
signal for each frame, the Examiner cites Fig. 3 of 
Yamaguchi and states that "figure 3 includes a buffer 
memory for temporarily storing the received speech signal 
for processing." Applicants respectfully state that there 
is no illustrated buffer memory. The Yamaguchi Fig. 3 
shows storage locations for noise, clean speech, noisy 
speech and Jacobian matrix memory units. Yamaguchi does 
not teach or illustrate a buffer memory. Moreover, the 
illustrated storage locations for Yamaguchi are not 
provided for storing a feature quantity acquired from a 
speech signal. 

With respect to the claim feature of steps and means for 
storing acoustic model data and language model data, 
respectively, Applicants acknowledge that Yamaguchi stores 
noise, clean speech, and noisy speech. That is not the 
same as or suggestive of storing acoustic model data and 
language model data. The Examiner states that "language 
model or grammar or dictionary is inherently included in 
the speech recognizer 13" of figure 3. Applicants
disagree. There are speech recognizers that do not include
those components. Applicants remind the Examiner
that an anticipation rejection requires that each and every 
claim feature be taught by the reference. See: In re
Schreiber, 128 F.3d 1473, 1477, 44 USPQ2d 1429, 1431 (Fed.
Cir. 1997); In re Paulsen, 30 F.3d 1475, 1478-1479, 31
USPQ2d 1671, 1673 (Fed. Cir. 1994); In re Spada, 911 F.2d
705, 708, 15 USPQ2d 1655, 1657 (Fed. Cir. 1990) and RCA
Corp. v. Applied Digital Data Sys., Inc., 730 F.2d 1440,
1444, 221 USPQ 385, 388 (Fed. Cir. 1984). Further, even
for a sustainable obviousness rejection, the Federal 
Circuit has stated that the obviousness determination "must
be based on objective evidence of record" (In re Lee, 277
F.3d 1338, 1343 (Fed. Cir. 2002)) and that "conclusory
statements" by an examiner fail to adequately address the
factual question of motivation, which is material to
patentability and cannot be resolved "on subjective belief
and unknown authority" (Id. at 1343-1344). Accordingly,
the Examiner cannot simply conclude that the Yamaguchi
speech recognizer has components that are not expressly
taught.

With respect to the claim feature of the steps and echo 
adaptation model generating portion for generating echo 
speech model data from a speech signal acquired immediately
prior to a current speech signal to be processed at the 
current time point and using the echo speech model data to 
generate adapted acoustic model data, the Examiner 
concludes that Figure 3 shows "Updating HMM". Applicants
note that Figure 3 does not show updating of the HMM and 
does not illustrate an echo adaptation model generating 
portion as claimed. Applicants disagree with the 
Examiner's conclusion that "echo" and "noise" are 
synonymous. The term "echo" is defined in the present 
specification (page 10, line 19-page 11, line 3) as "an 
acoustic signal which gives influence longer than the time 
width of the observation window." Applicants disagree with 
the Examiner's statement that "any noise source that is 
generated by the environment is considered an echo signal". 
An echo has a unique relationship to input speech, which 
does not exist with other environmental noise. The present 
invention provides a way to determine an absolute value of 
an echo (see: page 21, line 19 through page 22, line 2) 
dynamically and to perform more accurate speech recognition 
when an echo is present. 






Finally, the Examiner concludes that Yamaguchi's Fig. 
3 shows a recognition processing means for utilizing said 
feature quantity, said adapted acoustic model data and said 
language model data to provide a speech recognition result 
of the speech signal as claimed. Applicants reiterate that 
Yamaguchi does not teach or suggest the storing of feature 
quantities from current speech signals or an adapted 
acoustic model generated using the echo adaptation model 
generating portion operating on current speech signals. 
Further, Yamaguchi's Fig. 3 provides no teaching or 
illustration of speech recognition using those features and
components.

Anticipation under 35 USC 102 is established only when 
a single prior art reference discloses each and every 
element of a claimed invention. See: In re Schreiber, 128
F.3d 1473, 1477, 44 USPQ2d 1429, 1431 (Fed. Cir. 1997); In
re Paulsen, 30 F.3d 1475, 1478-1479, 31 USPQ2d 1671, 1673
(Fed. Cir. 1994); In re Spada, 911 F.2d 705, 708, 15
USPQ2d 1655, 1657 (Fed. Cir. 1990) and RCA Corp. v. Applied
Digital Data Sys., Inc., 730 F.2d 1440, 1444, 221 USPQ
385, 388 (Fed. Cir. 1984). Since the Yamaguchi patent
reference does not teach all of the claimed features, as 
outlined above, it cannot be concluded that Yamaguchi 
anticipates the invention as claimed. 

Based on the foregoing amendments and remarks,
Applicants respectfully request entry of the amendment, 
reconsideration of the rejections, and issuance of the 
claims.

Respectfully submitted, 
Takiguchi, et al.

By: /Anne Vachon Dougherty/ 
Anne Vachon Dougherty 
Reg. No. 30,374 
Tel. (914) 962-5910 